Moving from SQL to MongoDB’s MQL

Comments 0

Share to social media

This article is part of Robert Sheldon's continuing series on Mongo DB. To see all of the items in the series, click here.

People new to MongoDB often come to the platform with a background in relational database management systems (RDBMSs) such as MySQL or SQL Server. These platforms commonly use some form of the Structured Query Language (SQL) to manage and query the data within their databases. SQL was designed specifically to work with relational databases and manage their structures and data.

MongoDB is not a relational database system, nor does it use SQL. Instead, it relies on the MongoDB Query Language (MQL), a proprietary language for working with the MongoDB environment. MQL uses a JSON-like syntax to interact with database structures and their data, making possible to create collections, add documents, update data, and do much more.

Throughout this series, you’ve seen plenty of examples of MQL in action as you learned how to use the language to manage and query MongoDB data. However, some of you who come to MongoDB with a SQL background could find it challenging to transition to MQL because the two languages are conceptually different in many way. In such a case, you might find it helpful to see a comparison between common queries in both a relational database and a MongoDB database.

In this article, I compare a number basic SQL statements to their MQL counterparts in MongoDB. I’m using MySQL for the SQL side of the comparison, in part because it is a popular platform, but also because I already had it installed on my system.

Through these comparisons, you’ll be able to see how common operations performed in a relational database can be performed in MongoDB. This might help you better conceptualize how MongoDB works and how to carry out specific tasks. If you have access to both a MySQL instance and MongoDB deployment, you can try out the examples in this article so you can experience first-hand how the two languages compare.

You can use a RDBMS other than MySQL, but you’ll likely need to tweak the SQL statements to accommodate the differences between platforms. For example, when you create a table in MySQL, you can include the AUTO_INCREMENT attribute to generate a unique identity for new rows. In SQL Server, however, you would use the IDENTITY property in your column definition. With this in mind, let’s look how SQL and MQL compare.

Note: For the examples in this article, I used the same MongoDB Atlas environment I used for the previous articles in this series. Refer to the first article for details about setting up these environments. The examples are based on the hr database and employees collection. If you plan to try out these examples, the database and collection should not exist or, if they do exist, should not contain documents or schema validation rules. The examples in this article also include SQL statements, which were written against a MySQL instance, as I’ve already noted.

Comparing a relational database to a MongoDB database

Moving from a RDBMS to MongoDB requires a shift in thinking about how data is stored, managed, and accessed. In a relational database, the data is highly normalized and very structured to minimize redundancy and ensure data integrity. MongoDB is much more flexible. The schema is defined at the document level and can be different from one document to the next, even within the same collection.

Despite these differences, the main components within these two systems correspond with each other to some degree, which can help you better understand the differences and similarities between them. The following table shows how these components map to each other across the two systems.

Relational platform

MongoDB

database

database

table

collection

row

document

column

field

primary key

primary key

index

index

A database in a RDBMS stores data in highly structured tables with fixed schemas. A table’s schema applies to all rows in that table. The schema is typically defined when creating the table, although it can be later modified, as long as the modification doesn’t impact the existing data. Each table contains one or more columns that are defined with specific data types. A table also contains zero or more rows. A row is a collection of related data that describes a specific entity or event. All rows in a table have an identical structure that is based on the columns defined for that table.

A MongoDB database is a logical container that stores data in collections, with each collection containing zero or more documents. Unlike a relational table, a MongoDB collection does not impose a fixed structure on the individuals documents within that collection. Each document can contain one or more fields, and those fields can be defined without regard to the other documents in the collection. In fact, the documents in a collection do not even need to be related to each other, although they typically are.

It is possible to define validation rules on a MongoDB collection that enforce specific document schema, but that is an optional choice, not a requirement, as it is with a relational table.

Both relational tables and MongoDB collections support the use of primary keys. In a relational database, you can define a primary key on one or more columns in any table. The primary key ensures that each row in the table can be uniquely identified. Although the primary key is not required, most tables typically have one.

In MongoDB, the primary key is required and implemented through the _id field of each document. If the _id value is not provided when adding a document to a collection, MongoDB automatically generates one. A document cannot be added to a MongoDB collection without the _id primary key.

You can define indexes on both relational tables and MongoDB collections. In a relational database, you can define an index on one or more columns in a table. MongoDB can define an index on one or more fields in a collection. In both cases, indexes are used to help improve query performance, although they need to be used judiciously to avoid unnecessary overhead.

Creating your database environment

When working with a relational database, you typical use some form of SQL to manage your databases and data, as well as query your data. This is similar to how you use MQL in MongoDB. The rest of the article demonstrates how to use the two languages to perform similar operations, such as creating objects or retrieving data. For each example, I’ll first show you how to use SQL to carry out a specific operation in MySQL and then show you how to use MQL to carry out a similar operation in MongoDB.

We’ll start with creating and populating a simple database in MySQL. The following SQL statements create a database name hr, switch the context to that database, and then add three tables to the database:

I don’t plan to spend a great deal of time explaining how each SQL statement works. I’m assuming that, if you’re interesting in moving from a RDBMS to MongoDB, you already know how to use SQL. In case you should have forgotten something, however, you can find details about the SQL statements in the MySQL Reference Manual. As for the MQL statements, I’ve already described those in the earlier articles in this series.

All that said, I do want to point out a few things about these statements. First, notice that the code starts with a CREATE DATABASE statement that creates the hr database. This is followed by a USE statement that switches the database context to hr, after the database has been created.

Next come the table definitions for the positions, departments, and employees tables. Each one defines the columns that will be added to the table and designates one of those columns as the primary key. The primary key is configured with the AUTO_INCREMENT attribute, which is set to a specific starting number. In addition, the employees table includes two foreign keys: one that references the positions table and one that references the departments table.

After the tables have been created, we can then run the following INSERT statements to populate them with sample data:

Each INSERT statement specifies the names of specific columns in the target table and then provides the rows of data to be inserted into those columns. The primary key column is not included in the list of names because that data is generated automatically.

Now let’s look at how to do something similar in MongoDB. In this case, you don’t need to specifically create a database. Instead, you run a use command that changes the context to the hr database. Once you’ve switched context, you can then create the collection. MongoDB does not create the actual database until you add the first collection.

To get started, run the following use command to change the context to the hr database:

You can then use the createCollection method available to the database object to create the employees collection if it doesn’t already exist, as in the following command:

When you create a collection, you don’t define any schema. You’re simply creating a logical container for the documents that will be added to the collection. Note, however, that you don’t need to create the collection at all. If you try to insert documents into a nonexistent collection, MongoDB will create the collection automatically.

To add documents to a MongoDB collection, you can use the insertOne method to add a single document or use the insertMany method to add multiple documents. For example, the following insertMany command adds 10 documents to the employees collection:

The documents contain the same data as what was added to the hr database in MySQL, so all the documents have the same structure. However, you could have included different fields, arrays and other embedded documents, or fields with the same name but defined with different types.

When working with a RDBMS, you must first create the database, add strictly defined tables to the database, and then add data that conforms to the table schema. In MongoDB, you need only switch the database context and add the documents to the collection, whether or not the collection already exists. The documents themselves define the schema, rather than being defined when creating the collection.

Querying data in your database

One of the most important functions of any database management system is to ensure that the data is readily available so users and applications can query it when they need it, assuming they have the proper permissions. In this way, they can retrieve exactly the data they want and control how the data is returned.

To demonstrate how this works in MySQL, let’s start with a simple SELECT statement. The following example retrieves all the rows and columns of data in the employees table:

The statement uses an asterisk in the SELECT clause to indicate that all columns should be returned. The statement also returns all rows because it has not been qualified in any other way, which is often done through the addition of a WHERE clause. The statement’s results are returned as a grid, similar to a spreadsheet.

The employees table includes two foreign keys that are defined on the position_id and dept_id columns. The foreign keys point to the positions and departments tables, respectively. Because they’re foreign keys, the columns contain only integer values that match the primary keys in the referenced tables. If you want to display the actual position and department names, you must join the three tables, as in the following example:

The statement uses inner joins that are based on the position_id and dept_id columns. Notice that the statement also uses table aliases when referencing the column names.

As in the previous SELECT statement, this one has not been qualified in any way (other than the joins), so it will return all columns and all rows from all three tables. Because inner joins are used, each row returned by the statement links the employee to the correct position and department.

In most cases, you’ll want to refine the statement further, which we’ll do as we progress through the examples. But first, let’s look at how to return all the data from the employees collection in MongoDB. For this, we can use a the find method available to the collection object, as in the following example:

The command returns all 10 of the documents in the employees collection and returns them in their entirety. In this case, it is not necessary to link to another collection to retrieve additional information about each employee because the documents in this collection are complete in themselves. As with the preceding MySQL statement, we’ll be refining this command as we progress through the examples in order to limit their results.

Now let’s return to our MySQL examples. This time around, we’ll create a SELECT statement that limits the columns we want to include in the results. The following example returns only the emp_id, emp_name, and job_rating columns from the employees table and the job_title column from the positions table:

The statement limits the results to specific columns by listing them in the SELECT clause. The statement still returns all rows in the table, but only the data in the specified columns.

To achieve similar results in MongoDB, we can refine our find command by specifying the target fields in the method’s second argument, as in the following example:

In this case, the method takes two arguments. The first argument, which is referred to as the query, determines which documents to return. If the argument is an empty set of curly brackets, as in this case, all documents are returned.

The second argument is referred to as the projection. It determines which fields to return by listing the fields and assigning each one a value of 1. The argument does not include the _id field because it is returned by default. If you don’t want to include the _id field, you must explicitly exclude it in the argument by specifying 0 as the field’s value, rather than 1.

Now let’s refine our queries even further. First, we’ll update the MySQL statement by adding a WHERE clause that limits the returned rows to those with a dept_name value of R&D, as in the following example:

The SELECT statement should now return only four rows, with each row representing an employee in the R&D department.

To achieve similar results in MongoDB, we can modify the find command so the query argument specifies the position.dept_name field and R&D value in the first argument, as in the following example

Now the command returns only the expected four documents, each with a dept_name field value of R&D.

We can even sort our results in both MySQL and MongoDB. To do this in MySQL, we would use an ORDER BY clause. For example, the following SELECT statement uses the clause to sort the results in descending order, based on the values in the job_rating column:

To sort the documents in MongoDB data, we can add the sort method to our command, basing the sort operation on the job_rating field, as in the following example:

When you include the sort method, you need to specify a sort order as the field value, using 1 for ascending order and -1 for descending order.

Aggregating data in your database

Another operation we can carry out in both MySQL and MongoDB is to group and aggregate data. In MySQL, we can achieve this by adding a GROUP BY clause to the SELECT statement and using an aggregate function in the SELECT clause. For example, the following SELECT statement groups the data by the dept_name column and then calculates the average job_rating value for each group:

Notice that the WHERE clause includes two search conditions. Together, they specify that the results should include only those rows whose dept_name value is R&D or IT. In addition, the statement uses the AVG function in the SELECT clause to find the average job_rating values for R&D and IT. It also uses the ROUND function to round that average to one decimal point.

To achieve similar results in MongoDB, we need to use the aggregate method rather than the find method. The method’s arguments should include the aggregation stages necessary to filter and group the data, as in the following example:

The aggregate method defines three stages:

  • The $match stage limits to the returned documents to those whose position.dept_name value is either R&D or IT.
  • The $group stage groups those documents by the dept_name field and finds the average job_rating value for each group.
  • The $project stage rounds the avg_rating field returned by the $group stage and renames the _id field, which is based on the dept_name field.

The command returns two documents, each providing the avg_rating value for one of the two departments.

Updating data in your database

In addition to querying data, it’s also important to be able to update the data in your database, something you can easily achieve with both SQL and MQL. For example, the following SQL statement updates the employees table in MySQL by changing the hire_date value to 2022-09-21:

Because the WHERE clause is included, only the row with an emp_id value of 1006 will be updated. Without the WHERE clause, the statement would update every row in the table. The WHERE clause ensures that you’re limiting your changes to those rows that satisfy the search condition.

You can achieve similar results in MongoDB by running the following updateOne command:

In this case, the updateOne method takes two arguments. The first one specifies which document to update based on the _id value, and the second one uses the $set operator to set the hire_date value to the new date.

The _id value I’ve used here is the one that MongoDB generated automatically when I added the document to the collection. If you want to run this command, you will need to first retrieve that value on your system. Because the _id value is always unique, only one document can be updated, which is why the updateOne method is used.

At times, you might want to update multiple records in your database. For example, the following SELECT statement updates several rows in the employees table:

The WHERE clause includes two search conditions that must both evaluate to true for a row to be returned. The first one specifies that the dept_name value must be Marketing, and the second condition specifies that the current_emp column should be 1. The statement joins the employees table to the departments table, making it possible to reference the dept_name column in the WHERE clause.

If both conditions are met, MySQL will increase the job_rating value by 1 for each row returned by the WHERE clause. To achieve the same logic in MongoDB, you need to use an updateMany command, as in the following example:

The first argument for the updateMany method specifies that the position.dept_name field value should equal Marketing and that the current_emp value should be true. The second argument uses the $inc operator to increment the job_rating value by 1, but only for those documents that meet the two conditions specified in the first argument.

Deleting data in your database

Deleting documents in MySQL and MongoDB works much like updating documents. For example, you can use the following DELETE statement in MySQL to delete the row with an emp_id value of 1009:

To delete the same document in MongoDB (the document with the emp_name value of Jessie), you would use a deleteOne command and specify the document’s _id value, as shown in the following example:

Once again, the _id value I’ve used here is the one that MongoDB generated automatically. If you want to run this command, you will need to first retrieve that value on your system.

Now let’s look how to delete multiple records, without deleting all records. To accomplish this, you must ensure that your query targets only the records you want to remove and nothing more. For example, you might decide to remove all records for employees who are part of the Marketing department, but leave all other employee records in place. You can achieve this in MySQL by running the following DELETE statement:

Because the dept_name column is not in the employees table, you need to join the table to the departments table so you can reference the dept_name column in the WHERE clause. When joining tables in this way, you must reference the target table in both the DELETE clause and the FROM clause to ensure that there is no ambiguity about which rows should be deleted. If you’re using table aliases, as I’ve done here, you should use the alias in the DELETE clause.

In MongoDB, we can achieve similar results by using the deleteMany method, as in the following example:

In this case, the deleteMany method includes only one argument, which specifies that the position.dept_name value must equal Marketing. Only those documents that meet this condition will be deleted.

Altering structures in your database

I also want to show you a couple other differences between SQL and MQL, particularly when it comes to altering schema. As before, we’ll start with MySQL. The following SQL statements add the username column to the employees table and then populate the column by concatenating the emp_name and emp_id fields:

To concatenate the two fields, the UPDATE statement uses the CONCAT operator, along with the CONVERT operator to convert the emp_id value to a string.

We can also update the structure of MongoDB documents by using an updateMany command, as shown in the following example:

In this case, the updateMany method takes two arguments. The first argument is an empty set of curly brackets, which means that all documents in the collection should be the updated. The second argument uses the $set operator to set the value of the username field. Because the field does not exist, MongoDB will create it automatically.

The second argument also includes the $concat operator, which sets the field’s value by concatenating the emp_name field and the last four digits in the _id field. To capture the four digits, the expression uses the $toString operator to convert the _id value to a string, and then uses the $substr operator to return the last for digits.

The goal of all this was to create a username value that, for our purposes here, is unique for each record. In a real-world setting, you’ll likely take a more sophisticated approach to generating unique usernames for employees, whether working in MongoDB or MySQL. For this article, I wanted to keep things simple.

You can also create indexes in both MySQL and MongoDB. For example, you can use the following CREATE INDEX statement in MySQL to create a unique index named ix_username on the username column:

To create an index on the username field in MongoDB, you can use the following createIndex command:

The createIndex method takes two arguments. The first argument identifies the username field as the field on which the index should be based. The argument also includes a value of 1, which indicates that the index should be sorted in ascending order. The method’s second argument names the index ix_username and specifies that it should be a unique index.

You can also drop indexes in MySQL and MongoDB. To drop the ix_username index in MySQL, you can run the following DROP INDEX statement:

In MongoDB, you can run a dropIndex command to remove the index, as in the following example:

If you want to drop the column you just created in the MySQL database, you can again use the ALTER TABLE statement:

To drop the username field in MongoDB, you can use an updateMany command, like you did when adding the field, only this time, you should use the $unset operator, as in the following example:

The first argument in the updateMany method is an empty set of curly brackets, indicating that all documents in the collection should be updated. The second argument uses the $unset operator to remove the username field, specifying an empty string for the field value.

Getting started with your move to MongoDB

In this article, I’ve tried to provide you with an overview of how SQL statements compare to MQL commands when carrying out similar operations. Although the examples I’ve provided are fairly basic, they should help you conceptualize some of the differences between the two approaches and give you a better sense of how to carry out a task in MQL that you’re familiar with SQL.

There is, of course, no substitute for digging into MongoDB on your own and learning how to perform different operations. At the same time, you should be careful not to get so locked into the SQL-MQL comparisons that you lose sight of the fact that a RDBMS and MongoDB are two very different approaches to managing and accessing data. However, the information in this article should at least provide you with one more tool for better understanding how MongoDB works so you can interact with it as comfortably as MySQL, PostgreSQL, Oracle, SQL Server, or whatever RDBMS you’ve been using.

Article tags

Load comments

About the author

Robert Sheldon

See Profile

Robert is a freelance technology writer based in the Pacific Northwest. He’s worked as a technical consultant and has written hundreds of articles about technology for both print and online publications, with topics ranging from predictive analytics to 5D storage to the dark web. He’s also contributed to over a dozen books on technology, developed courseware for Microsoft’s training program, and served as a developmental editor on Microsoft certification exams. When not writing about technology, he’s working on a novel or venturing out into the spectacular Northwest woods.